在本文中,我们将预处理技术应用于具有不同长度的多通道时间序列数据,我们称之为对齐问题,用于下游机器学习。多种原因可能发生多种渠道时间序列数据的未对准,原因有多种原因,例如丢失的数据,变化的采样率或不一致的收集时间。我们考虑从MIT SuperCloud高性能计算(HPC)中心收集的多渠道时间序列数据,其中不同的工作开始时间和HPC作业的运行时间不同,导致数据不对准。这种未对准使得为计算工作负载分类等任务构建AI/ML方法具有挑战性。在先前使用MIT SuperCloud数据集的监督分类工作的基础上,我们通过三种宽阔的低间接空间方法解决了对齐问题:从全职系列中抽样固定子集,在全职系列上执行摘要统计信息,并对系数进行取样。从映射到频域的时间序列。我们最佳性能模型的分类精度大于95%,以先前的方法对MIT SuperCloud数据集的多通道时间序列分类的表现优于5%。这些结果表明,我们的低间接费用方法与标准机器学习技术结合使用,能够达到高水平的分类准确性,并作为解决对齐问题(例如内核方法)的未来方法的基准。
translated by 谷歌翻译
深度学习模型推断是许多企业和科学发现过程中的关键服务。本文介绍了Ribbon,这是一种新颖的深度学习推理服务系统,符合两个相互竞争的目标:服务质量(QoS)目标和成本效益。功能区背后的关键思想是智能采用各种云计算实例(异质实例)来满足QoS目标并最大程度地节省成本。功能区设计了一种贝叶斯优化驱动的策略,该策略可帮助用户在云计算平台上为其模型推理服务需求构建最佳的异质实例集 - 并且,功能区展示了其优于使用均匀实例池的推理服务系统的优越性。功能区可为不同的学习模型节省多达16%的推理服务成本,包括新兴的深度学习建议系统模型和药物发现的启用模型。
translated by 谷歌翻译
通过一系列联邦举措和命令,美国政府一直在努力确保美国在AI中的领导。这些广泛的战略文件影响了美国空军美国部(DAF)等组织。DAF-MIT AI加速器是DAF和MIT之间的一项计划,以弥合AI研究人员与DAF任务要求之间的差距。DAF-MIT AI加速器支持的几个项目正在开发公共挑战问题,这些问题解决了许多联邦AI研究的重点。这些挑战是通过公开可用的大型AI-Ready数据集,激励开源解决方案,并为可以激发进一步研究的双重使用技术创建需求信号,来针对优先事项。在本文中,我们描述了正在开发的这些公共挑战以及它们的应用如何促进科学进步。
translated by 谷歌翻译
人工智能尚未彻底改变材料和分子的设计。在这种观点中,我们确定了四个障碍,阻碍了原子深度学习,分子科学和高性能计算的整合。我们概述了重点的研究努力,解决这些挑战所提供的机会。
translated by 谷歌翻译
分子和材料科学的深度学习受应用科学,人工智能和高性能计算之间缺乏融合的限制。关于培训数据量,模型架构的规模和复杂程度以及计算基础设施的规模的瓶颈是限制分子和材料深度学习缩放的关键因素。在这里,我们呈现$ \ texit {litmatter} $,轻量级框架用于缩放分子深度学习方法。我们在超过400个GPU上培训四个图形神经网络架构,并调查这些方法的缩放行为。根据模型架构,可以看到高达60美元的培训时间加速。经验神经缩放关系量化模型依赖性缩放,使能最优计算资源分配和可伸缩分子几何深度学习模型实现的识别。
translated by 谷歌翻译
传统的基于频率的投影滤波器,或投影运算符(PO),通过一系列变换单独的信号和噪声,该变换除去存在噪声的频率。然而,该技术依赖于先验的频率包含信号和噪声的知识,并且这些频率不重叠,这难以在实践中实现。为了解决这些问题,我们介绍了PO-Neural网络混合模型,伪投影算子(PPO),其利用神经网络进行频率选择。我们将PPO,PO和DENOISING AUTOENCODER(DAE)的过滤功能与罗切斯特大学的多模态音乐性能数据集进行了各种添加的噪声类型。在大多数实验中,PPO都优于PO和DAE。根据这些结果,我们建议将PPO应用于身体和生物科学中的问题。
translated by 谷歌翻译
We consider a long-term average profit maximizing admission control problem in an M/M/1 queuing system with a known arrival rate but an unknown service rate. With a fixed reward collected upon service completion and a cost per unit of time enforced on customers waiting in the queue, a dispatcher decides upon arrivals whether to admit the arriving customer or not based on the full history of observations of the queue-length of the system. \cite[Econometrica]{Naor} showed that if all the parameters of the model are known, then it is optimal to use a static threshold policy - admit if the queue-length is less than a predetermined threshold and otherwise not. We propose a learning-based dispatching algorithm and characterize its regret with respect to optimal dispatch policies for the full information model of \cite{Naor}. We show that the algorithm achieves an $O(1)$ regret when all optimal thresholds with full information are non-zero, and achieves an $O(\ln^{3+\epsilon}(N))$ regret in the case that an optimal threshold with full information is $0$ (i.e., an optimal policy is to reject all arrivals), where $N$ is the number of arrivals and $\epsilon>0$.
translated by 谷歌翻译
Active target sensing is the task of discovering and classifying an unknown number of targets in an environment and is critical in search-and-rescue missions. This paper develops a deep reinforcement learning approach to plan informative trajectories that increase the likelihood for an uncrewed aerial vehicle (UAV) to discover missing targets. Our approach efficiently (1) explores the environment to discover new targets, (2) exploits its current belief of the target states and incorporates inaccurate sensor models for high-fidelity classification, and (3) generates dynamically feasible trajectories for an agile UAV by employing a motion primitive library. Extensive simulations on randomly generated environments show that our approach is more efficient in discovering and classifying targets than several other baselines. A unique characteristic of our approach, in contrast to heuristic informative path planning approaches, is that it is robust to varying amounts of deviations of the prior belief from the true target distribution, thereby alleviating the challenge of designing heuristics specific to the application conditions.
translated by 谷歌翻译
Microprocessor architects are increasingly resorting to domain-specific customization in the quest for high-performance and energy-efficiency. As the systems grow in complexity, fine-tuning architectural parameters across multiple sub-systems (e.g., datapath, memory blocks in different hierarchies, interconnects, compiler optimization, etc.) quickly results in a combinatorial explosion of design space. This makes domain-specific customization an extremely challenging task. Prior work explores using reinforcement learning (RL) and other optimization methods to automatically explore the large design space. However, these methods have traditionally relied on single-agent RL/ML formulations. It is unclear how scalable single-agent formulations are as we increase the complexity of the design space (e.g., full stack System-on-Chip design). Therefore, we propose an alternative formulation that leverages Multi-Agent RL (MARL) to tackle this problem. The key idea behind using MARL is an observation that parameters across different sub-systems are more or less independent, thus allowing a decentralized role assigned to each agent. We test this hypothesis by designing domain-specific DRAM memory controller for several workload traces. Our evaluation shows that the MARL formulation consistently outperforms single-agent RL baselines such as Proximal Policy Optimization and Soft Actor-Critic over different target objectives such as low power and latency. To this end, this work opens the pathway for new and promising research in MARL solutions for hardware architecture search.
translated by 谷歌翻译
Generalizability of time series forecasting models depends on the quality of model selection. Temporal cross validation (TCV) is a standard technique to perform model selection in forecasting tasks. TCV sequentially partitions the training time series into train and validation windows, and performs hyperparameter optmization (HPO) of the forecast model to select the model with the best validation performance. Model selection with TCV often leads to poor test performance when the test data distribution differs from that of the validation data. We propose a novel model selection method, H-Pro that exploits the data hierarchy often associated with a time series dataset. Generally, the aggregated data at the higher levels of the hierarchy show better predictability and more consistency compared to the bottom-level data which is more sparse and (sometimes) intermittent. H-Pro performs the HPO of the lowest-level student model based on the test proxy forecasts obtained from a set of teacher models at higher levels in the hierarchy. The consistency of the teachers' proxy forecasts help select better student models at the lowest-level. We perform extensive empirical studies on multiple datasets to validate the efficacy of the proposed method. H-Pro along with off-the-shelf forecasting models outperform existing state-of-the-art forecasting methods including the winning models of the M5 point-forecasting competition.
translated by 谷歌翻译